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ABSTRACT 

We present two related techniques to measure the two-point correlation function and the power 
spectrum with edge correction in any spatial dimensions. The underlying algorithm uses fast Fourier 
transforms for calculating the two-point function with an heuristically weighted edge corrected es- 
timator. Once the correlation function is measured, we estimate the power spectrum via numerical 
integration of the Hankel transform connecting the two. We introduce an efficient numerical technique 
based on Gauss-Bessel-quadrature and double exponential transformation. This, combined with our, 
or any similar, two-point function estimator leads to a novel edge corrected estimator for power spec- 
tra. The pair of techniques presented are the Euclidean analogs of those developed and widely used 
in cosmic microwave background research for spherical maps. 

Subject headings: large scale structure — cosmology: theory — methods: statistical 



1. INTRODUCTION 

The two-point correlation function and its Fourier 
transform, the power spectrum, are the most measured 
statistics in large scale structure studies. Correlation 
functions can be measured naively with an 0(N 2 ) tech- 
nique through cycling through pairs of objects. The ad- 
vantage of that is tha t t he edge corrected estimator o f 
ILa.nriv Rr. Szalavl iTmfll or lSza.nnrli Sza.la.vl frnflll^fXt 
(c.f, Equation can be used for nearly optimal mea- 
surements. The naive counting of pairs can be sped up, 
if one is interested in measuring the correlation function 
on small scales, and reasonab ly large bins. In th at case 
the double-tree algorithm by iMoore et alJ l)200lT) scales 
approximately as 0(N log N). However, if all scales are 
considered, with high resolution, the tree-based algo- 
rithm has to split all nodes of the tree, thus it will de- 
generate into the 0(N 2 ) scaling of the naive algorithm. 

Measurement of the power spectrum can be obtained 
directly with fast Fourier transforms (FFTs). The scal- 
ing of this algorithm 0(N log N), where N corresponds 
to the dynamic range of the measurement. A particu- 
lar advantage of the FFT is that the speed is indepen- 
dent of the maximum scale (or smallest wavenumber): 
scales up to the fundamental mode can be measured to- 
gether with scales corresponding to the grid size. The 
resulting estimates, however, are convolved with the sur- 
vey window. In cosmic microwave background (CMB) 
studies, such power spectra are termed pseudo-power 
spectra, and have been entirely re placed by edge cor- 
rected (or true) po wer spectra (Ye.g.. lSzapudi et al.l200H 
iHivon et alJ l2~002)). In this paper we ask the question: 
can we take advantage of the FFT algorithms, and calcu- 
late the correlation function in 0(N log N) time even on 
large scales? Can we exploit the edge corrected estima- 
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tor for the two-point correlation function, and estimate 
true (edge-corrected) power spectra? As we show next, 
the answer is affirmative to both questions, in complete 
analogy to results already obtained in CMB research. 

In the next §2 we describe a new technique to measure 
edge corrected correlation functions using an FFT based 
algorithm, and a numerical technique to obtain edge cor- 
rected power spectra. We present numerical results in §3, 
and summarize and discuss them in the last §4. 

2. DESCRIPTION OF THE TECHNIQUE 

It is well known that the power spectrum is the Fourier 
transform of the two-point correlation function (Wiener- 
Khinchin theorem). In cosmic microwa ye background 
resear ch, this basic idea was exploited bv lSzaoudi et alJ 
( 2001) to obtain a fast method for estimating the angu- 
lar power spectrum C/s with fast harmonic transforms. 
A crucial intermediate step i n their method was to es- 
timate correlation functions l)Szapudi et alJ 12001). Al- 
though th^^indow function can be inverted in harmonic 
space ijHivon et al J 120021) . edge effect correction can be 
trivially obt ained from the estimator iSzapudi fc Szalavl 
l)1998l 12000(1 . This correspond to inverting the window 
matrix in pixel space, where it is diagonal. Our plan is to 
adapt this method for Euclidean space, arbitrary dimen- 
sions. The resulting technique is naturally divided into 
two steps: i) fast calculation of the two-point correlation 
function via FFT ii) calculation of the power spectrum 
via a numerical Hankel transform. 

2.1. Two-point correlation function 

The minimum variance estimator of lSzapudi fc Szalavl 
( 1998) is most natural when defined on a grid. If Si is the 
density field on grid point i, we can define the estimator 

as 
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where is the pair weight, and we assumed that 
(Si) = 0. Naive estimation with general pair weight is 
0(N 2 ), where N is the number of pixels. As we will see, 
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fast estimation is possible in the special case of the pair 
weight, where /y = fifjS^j A , i.e. the weight is mul- 
tiplicative, and depends only on a shift A. Here Sj^ r is 
the Kronecker-(5, taking values of 1 when k = I, and 
otherwise. The simplest example is flat weighting fa = 1. 

The correlation function without the normalization is a 
"raw-correlation" function of unnormalized counts. The 
normalization for an arbitrary complex geometry can be 
obtained by calculating the raw correlation function of 
the geometry, putting l's into valid grid points, and O's 
everywhere else. While our notation suggests D = 1 di- 
mensions, in fact arbitrary dimension is included, sim- 
ply by replacing indices, say i, with tuples of indices 
(ii, . . .,»£>). _ 

Computationally, one needs to calculate pair summa- 
tions of the form ^OiOi+A- These be reformulated to 
make use of Fast Fourier Transforms (FFTs), one of 
fastest algorithms in existence. If P(x) = cio + ciix + . . .+ 
a„_ia; n_1 is a polynomial, and e = e 27Tl ^ r unit roots, the 
coefficients of a discrete Fourier transform of the series 
di can be defined as 

a k = P(e k ) = a Q + ai e k + ... + a n ^ n -^ k . (2) 
Direct calculation confirms that ^2 o/c^fe+A can be calcu- 
lated by Fourier transforming the series Oi and 6^, mul- 
tiplying the resulting a,kb* kl and finally inverse Fourier 
transforming back. This simple observation is the dis- 
crete analog of the Wiener-Khinchin theorem. The 
Fourier space quantity arising in the above algorithm is 
the discrete, Euclidean "pseudo-power spectrum" . Note 
that in fact all previ ous direct mea s ureme nts of the 
power spectrum using IFeldman et alJ l) 19941) type esti- 
mators measured pseudo power spectrum. We will see in 
the next subsection, how to obtain the (edge corrected) 
power spectrum instead from the correlation function. 

Using the above observation, we can put together a 
fast algorithm to calculate two-point correlation function 
on in D dimensions with the following steps: i) placing 
the points in a sufficiently fine grid, storing the value 
iVk, the number of objects (e.g. galaxies) at (vector) 
grid point k, (this step is omitted if the density field 
is a given, such as Euclidean approximation of CMB 
maps, etc.) ii) calculating fluctuations of the field by 
5 = (N — (N))/(N) for each grid point iii) possible 
weighting each point with a minimum var iance weight 
(e.g. J3 weighting, or IFeldman et all l|1994|) weighting), 
iv) discrete Fourier transform with a fast FFT engine, v) 
multiplying the coefficients vi) Fourier transform back. 
The same procedure is followed for the mask with zero 
padding large enough (i.e. larger than the largest bin of 
the correlation function) to avoid aliasing effects. Finally, 
the raw correlation function is divided with the correla- 
tion function of the mask, according to Equation ^ 

The result will be an inhomogeneous correlation func- 
tion £(r), depending on vector shifts r. To obtain the 
traditional correlation function one has sum the result in 
rings/spheres for D — 2, 3, respectively. 

2.2. Power Spectrum 

The power spectrum is a D-dimensional Fourier trans- 
form of the two-point correlation function. Using rota- 
tional invariance, this will become a Hankel transform in 
D = 2, 3 dimensions: 

e(r) = /^P(fe)Jo(fcr) 2D, 



e(r) = /^P(*)io(*r)3D, (3) 
where Jo and jo are ordinary and spherical Bessel func- 
tions, respectively. As well known the Hankel transform 
is self inverse, and our plan is to use the inverse of the 
above formulae to obtain an estimator of the power spec- 
trum. 

In practice, it is a delicate numerical procedure to 
calculate a Hankel tran sform numeri cally. We use the 
quadrature formulae of iQgatal i|2005f) . which uses both 
the roots of Bessel functio ns, as well as the dou ble ex- 
ponential transformation of lOoura Sz Moril l)1999f) . Since 
these techniques are fairly new developments in numeri- 
cal mathematics, we summarize the recipe here. Explic- 
itly, to integrate over an arbitrary function / multiplied 
with a Bessel function we use the formula 

/>oo 

/ f(x)J v {x) ^TrYX=x w vkf(^i>{hr V k)/h) x 
Jo 

J u (^{hr vk )/h)^{hr vk ), (4) 

where r vk are the roots of the Bessel function J„, ijj{t) = 
t tanh( ^ sinh t) the double exponential transformation, 
h is the step of the integration (similarly to that of the 
trapezoidal rule). The weights are calculated as w vk = 
2/(ir 2 rvkJv+i( nr vk))- For the inverting the two-point 
function v = and v = 1/2 Bessel functions should be 
used for 2 and 3 spatial dimensions according to EqsEl 

3. RESULTS 
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Fig. 1. — The tw o-point correlation function measured with the 
IMoore et al. (2001) tree-based algorithm (black symbols with er- 
rorbars) vs. our Fourier based approach (red/gray) in a set of 
VLS simulations (see description in text). The agreement is vir- 
tually perfect, despite the difference in the methods and binning. 
For comparison, we also show a separate measurement of the two- 
point function where (erroneously) periodic boundary conditions 
were assumed (solid line). This can be considered as a measure- 
ment without proper edge correction. 

The above algorithm to measure the correlation func- 
tion was implemented in C using the FFTW3 package for 
two and three dimensions. The two-dim ensional version 
of the algorithm was extensively tested infBudavari et all 
(2003), the first application to measure the angular cor- 
relation function in the SDSS. In what follows, we focus 
on three-dimensions. 
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To test our algorithm for the two-point correlation 
function, and the power spectrum, we performed mea- 
surements in ACD M simulations by the V irgo Supercom- 
puting Consortium Uenkins et all l)1998f) . We used out- 
puts of VLS (Very Large Simulation) with the following 
cosmological parameters: il m = 0.3, f2„ = 0.7, T = 0.21, 
h = 0.7 and as = 0.9. In order to estimate measurement 
errors, we divided the VLS simulation into eight inde- 
pendent subsets, thus each subset was (239.5h _1 Mpc) 3 
cube. Note that thus sub-cubes are not subject to peri- 
odic boundary conditions, thus, although their geometry 
is simpler than a typical galaxy survey, they are subject 
to edge effects, unlike a full simulation cube. 

We have measured the correlation function in logarith- 
mic bins in the range of 0.3 — 162h _1 Mpc with our imple- 
mentation of the algorithm eSpICE (Euclidean version of 
SpIC E, the spatial l y inh omogeneous correlation estima- 
tor, ISzaEudOOD 1)2001)) ) Measurement of the correla- 
tion function on a 768 3 grid takes about 15 minutes on 
one Opteron processor. In ad dition, we also perfo rmed 
control measurements with the lMoore et alJ 1)2001)) tree- 
based algorithm in logarithmic bins from 0.1 — 38h~ 1 Mpc 
scales. The required CPU time of this latter algorithm 
increases drastically on large scales, hence we stopped 
at smaller scales. Even though the binning of the two 
runs were not identical, the two algorithms produced 
virtually identical curves as shown on Figure ^ I n addi- 
tion, we have measured the correlation function assum- 
ing -erroneously, according to our previous discussion- 
periodic boundary conditions. The difference between 
the results based on periodic and non-periodic bound- 
ary conditions demonstrate that edge effects are already 
significant on 40h~ 1 Mpc scales in a 239.5h _1 Mpc size 
simulation. 

The next step was calculating the edge corrected power 
spectrum. First, we have performed a cubic spline inter- 
polation on the measured correlation function, to turn 
the discrete points into a function, which can be sam- 
pled at the roots of Bessel functions. Then we applied 
the recipe of Equation^lfor v = 1/2 to integrate the first 
of Eqs. |2I We used h = 1/32 and truncated the sum 
at N — 200; doubling these values made no significant 
effect on our results. We have checked the accuracy or 
our integration routine with N = 10000 point integra- 
tion acc ording to the d i rect d ouble exponential integra- 
tion bv lOoura fc Mora l)1999)K and with Mathematica. 
This showed that this fast 200 point integration is bet- 
ter than 0.1% accurate, which was good enou gh for our 
purp oses. Note that with some care, FFTLog l)Hamiltonl 
2000) could probably adapted to the same purpose, al- 
though the aliasing and numerical effects are more deli- 
cate. 

The results are shown on Figure [5] We have applied 
the inversion both to the results from eSpICE, and from 
the tree based algorithm. We used a scale range conser- 
vatively as (Awn.fcmax) = 27r/r max , l/r m i n . In addition 
we plo tted the linear theory and non-linear iSmith et al.l 
( 2003) power spectra. The agreement of our reconstruc- 
tion with the phenomenology is nearly perfect. 

4. SUMMARY AND DISCUSSIONS 

We have presented a fast Fourier based algorithm, 
which is compleme ntary to the tree-based algorithm of 
iMoore et all 1)20011) . While the latter is ideal for mea- 
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Fig. 2. — The power spectrum extracted from the two-point 
correlation function using the method described in t he text. The 
green and red lines show the linear and non-linear Smit rTet alJ 
(2003) power spectra. The other symbols and lines correspond to 
Figure |2 



suring the correlation function on small scales, taking 
advantage of a KD-tree construction, our grid based ap- 
proach is ideal for large scales. Our algorithm has the 
advantage that it does not slow down on large scales, al- 
though the grid has a smoothing effect on the correlation 
function on scales a few times the grid spacing. 

An interesting side effect of our algorithm is that as 
an intermediate step it produces inhomogeneous corre- 
lation functinos of vector sh ifts £(r). These has been 
used in lBudavari et al.1 l)2003f) to filter out systematic ef- 
fects from the drift scanning operations of the camera. In 
three dimensions, this could be used for estimation of the 
rcdhisft space correlation function in the distant observer 
approximation, for narrow, deep redshift surveys. 

We also presented a method to turn a measured corre- 
lation function into an edge corrected power spectrum. 
This is, apart from numerical details, a di rect Euclidean 
genera lization of the spherical method by ISzaoudi et alJ 
( 2001). This new technique is independent on the details 
of how the two-point function is estimated: our Fourier 
based approach, a tree-based algorithm, or even a direct 
N 2 algorithm can be used to measure the two-point func- 
tion. The inversion for the power spectrum should work 
just the same. In the present work we did not correct for 
grid effects in the inverted power spectrum, but it should 
scale as W(k) 2 = sinc(k) 2D in D dimensions. 

Were we measuring the correlation function in linear 
scales (a built in option for eSpICE which does not in- 
fluence the speed), we would have obtained P(k) in lin- 



ear bins. The maximum 



at which the correlation 



function is measured, determines the resolution, and the 
smallest /c m i n — A/c ~ l/^max for P(k), while the res- 



olution and the smallest 



at which the correlation 



function is measured, determine fc max — l/^min- 

I accordance with the overwhelming majority of prac- 
tical measurements in the past, we choose to measure 
the correlation function in logarithmic bins. As a result 
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we obtained P(k) in approximately logarithmic bins in 
k. To see this, one has to note that a fixed resolution 
corresponding to a particular k in the power spectrum, 
corresponds to a maximum r at which the correlation 
function is measured at this resolution. This maximum 
translates into an effective resolution Ak at this particu- 
lar k. This qualitative argument can be made more rig- 
orous with calculation of the effective window function if 
needed. 

Large scale structure studies, up until now, have esti- 
mated "pseudo power spectra" (a term borrowed from 
cosmic microwave background research) i.e. one con- 
volved with the survey geometry. The potential of de- 
tecting baryonic oscillations in the measurement of the 
power spectrum motivates the need for measuring power 
spectra with the highest resolution possible. In partic- 
ular, if one of the survey dimensions is much smaller 
than the other two, the effective convolution kernel will 
be much wider than desirable. In the above we demon- 
strated that the power spectrum can be recovered from 
steps not any more complicated than the measurement of 
the corresponding pseudo power spectrum. While the in- 
form ation content can not be improved with the inversion 
(e.e. lEfstathioull2004[) . it is immensely useful when com- 
paring measurements from different surveys, from differ- 
ent geometries, as it has been common in CMB research. 

Note that the present technique can be used to esti- 



mate CiS in the flat sky approximation, using the asymp- 
totic behavior of the Legendre polynomials Pi(cos9) ~ 

cos((7 + 1)6* — 7r/4), and the Bessel functions 



■it£ sin 9 



Jq(z) ~ y Tz cos ( z — 7r /^)- The correlation function can 
be measured with eSpICE, and integration of Equation^] 
gives the GYs with approximately k ~ £ + 1/2. Gen- 
eralization of the proposed computational and inversion 
techniques to gravitational lensing follows directly from 
the generalization SpICE to CMB polarization, such as 
iChon et al.l p0041. Finally, generalization of Equation|31 
for a relatio n between the three-point function and the 
bispectrum l)Szapudl2004|) together with a fast algorithm 
to measure the three-point function yields a new, edge 
corrected method to measure the bispectrum. These gen- 
eralizations will be published elsewhere. 

Our eSpICE implementation will be made public upon 
acceptance of this paper. 
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